Extended algorithm data estimator

ABSTRACT

A method of data estimation for a Time Division Duplex (TDD) Code Division Multiple Access (CDMA) system or any other system using an extended algorithm (EA) in preference to a truncated algorithm (TA). The EA avoids implementation errors by choice of proper extended matrices, and permits the use of one piece of hardware. The EA also obviates loss of multipath signals in the tail part of each data field, and avoids errors due to transformation of a Toeplitz matrix to a circulant matrix.

CROSS REFERENCE TO RELATED APPLICATION(S)

This application claims priority from U.S. Provisional Application No. 60/409,973, filed on Sep. 9, 2002, which is incorporated by reference as if fully set forth.

FIELD OF INVENTION

The invention generally relates to data estimation for wireless communication systems. More particularly, the invention relates to an extended algorithm (EA) for data estimation in such systems.

BACKGROUND

In some proposed wireless systems, data is wirelessly transmitted block by block with a separation interval between successive blocks. This property permits the application of joint detection (JD) in receivers to suppress the inter-symbol interference (ISI) and multiple-access interference (MAI). Single user detectors (SUDs) are used to estimate data of signals that go through a single downlink channel. An advantage of the SUD is that it can be implemented efficiently by fast Fourier transform (FFT), based on the rationale that a square Toeplitz matrix can be approximated as a circulant counterpart of the same size.

When Toeplitz matrices are shortened along their longer dimensions to square matrices and replaced with their circulant counterparts, an approximation error is introduced. This error is most prevalent in the head and tail portions of the matrix. In many systems, the data associated with the head and tail portions carries system information required by the receivers, such as power control bits and the transport format combination indicator (TFCI) in the proposed third generation partnership project (3GPP) wideband code division multiple access (WCDMA) time division duplex (TDD) system. It is desirable to enhance data estimation in such systems.

SUMMARY

The present invention provides a computationally efficient and accurate implementation of a data estimator for systems such as Frequency Division Duplex (FDD) or Time Division Duplex (TDD) Code Division Multiple Access (CDMA) systems. Described herein is a method of implementing a data estimator wherein no data of interest for estimation is affected significantly by the circular approximation error. To achieve this, all of the square circulant matrices are extended. The advantages of the extended approach arise from two aspects: (1) the avoidance of loss of multipath signals in the tail part of each data field and (2) the avoidance of error due to the Toeplitz to circulant matrix transformation. As a result, longer discrete Fourier transforms (DFTs) or FFTs are performed when implementing the EA. To minimize the required computations with a DFT, the extended sizes should preferably be limited dynamically to their lower bounds according to the specific data block length and channel delay spread. However, if a prime factor algorithm (PFA) is used, increasing the FFT length usually does not increase the computational complexity; the computations can instead be minimized by choosing a proper FFT length within a certain range. In this case, a fixed single-length FFT, sized for the longest block length and delay spread, is desirable. The single-length FFT with PFA allows different data block lengths (burst types) to be supported by one algorithm. This simplifies the implementation further, since only one piece of hardware is needed for the single algorithm.

BRIEF DESCRIPTION OF THE DRAWING(S)

A more detailed understanding of the invention may be had from the following description of preferred embodiments given by way of example only and to be understood with reference to the accompanying drawing wherein:

FIG. 1 is a block diagram of a system used for implementing an EA with over-sampling in accordance with a preferred embodiment of the present invention;

FIGS. 2A, 2B and 2C, taken together, are a flow chart illustrating the method steps implemented by the EA of FIG. 1;

FIG. 3 is an illustration of raw bit error rate (BER) of TFCI-1 versus signal to noise ratio (SNR) in Case 1 channel (TFCI represents transport format combination indicator);

FIG. 4 is an illustration of raw BER of TFCI-2 versus signal to noise ratio (SNR) per code in Case 1 channel;

FIG. 5 is an illustration of raw BER of all bits versus SNR per code in Case 1 channel;

FIG. 6 is an illustration of raw BER of TFCI-1 versus SNR per code in Case-2 channel;

FIG. 7 is an illustration of raw BER of TFCI-2 versus SNR per code in Case-2 channel;

FIG. 8 is an illustration of raw BER of all bits versus SNR per code in Case-2 channel; and

FIGS. 9A and 9B are receiver implementations using extended algorithm data detection.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT(S)

The invention is generally applicable for data estimation in CDMA systems, such as 3GPP TDD mode and time division synchronous code division multiple access (TD-SCDMA). However, the following description, as an example, refers to a TDD CDMA system model and algorithm. In the described example, if a composite spread signal is sent from one transmitter site to one reception site, the received signal, r, is composite spread signal, s, passed through a single channel, H. H is the channel response matrix. This process can be represented as r=Hs+n, where n is the noise vector. With W being the length of channel response, H takes the form per Equation 1.

$\underline{H} = \begin{bmatrix} h_0 & & & \\ h_1 & h_0 & & \\ \vdots & h_1 & \ddots & \\ h_{W-1} & \vdots & \ddots & h_0 \\ & h_{W-1} & & h_1 \\ & & \ddots & \vdots \\ & & & h_{W-1} \end{bmatrix} \qquad \text{Equation 1}$

H is of size (L+W−1) by L. L denotes the number of chips in the time period of interest, such as a data field (block). The composite spread signal s can be expressed as s=Cd, where the symbol vector d is per Equation 2.

$\underline{d} = (d_1, d_2, \ldots, d_{KN_s})^T \qquad \text{Equation 2}$

T denotes the transposition, and the code matrix C is per Equation 3.

$\underline{C} = [\underline{C}^{(1)}, \underline{C}^{(2)}, \ldots, \underline{C}^{(K)}] \qquad \text{Equation 3}$

Each C^((k)) is per Equation 4.

$\begin{matrix} {{\underset{\_}{C}}^{(k)} = {\begin{bmatrix} c_{1}^{(k)} & \; & \; & \; & \; & \; & \; & \; \\ . & \; & \; & \; & \; & \; & \; & \; \\ c_{Q}^{(k)} & \; & \; & \; & \; & \; & \; & \; \\ . & c_{1}^{(k)} & \; & \; & \; & \; & \; & \; \\ \; & . & \; & \; & \; & \; & \; & \; \\ \; & c_{Q}^{(k)} & . & \; & \; & \; & \; & \; \\ \; & \; & \; & . & \; & \; & \; & \; \\ \; & \; & \; & \; & . & \; & \; & \; \\ \; & \; & \; & \; & \; & . & \; & \; \\ \; & \; & \; & \; & \; & \; & . & c_{1}^{(k)} \\ \; & \; & \; & \; & \; & \; & \; & . \\ \; & \; & \; & \; & \; & \; & \; & c_{Q}^{(k)} \end{bmatrix}.}} & {{Equation}\mspace{14mu} 4} \end{matrix}$

Q, K, and N_(S)(=L/Q) denote the spread factor (SF), the number of active codes, and the number of symbols carried on each channelization code, respectively.
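The signal model above can be illustrated with a minimal NumPy sketch. The helper names `channel_matrix` and `code_matrix`, the toy sizes, and the random test data are assumptions made for illustration only; they do not come from the described system. The sketch builds H per Equation 1 and C per Equations 3 and 4, then forms the noiseless received vector r=HCd.

```python
import numpy as np

def channel_matrix(h, L):
    """Build the (L+W-1) x L Toeplitz channel matrix of Equation 1."""
    W = len(h)
    H = np.zeros((L + W - 1, L), dtype=complex)
    for col in range(L):
        H[col:col + W, col] = h          # each column is h shifted down by one chip
    return H

def code_matrix(codes, Ns):
    """Build C = [C^(1), ..., C^(K)]; codes has shape (K, Q)."""
    K, Q = codes.shape
    C = np.zeros((Q * Ns, K * Ns), dtype=complex)
    for k in range(K):
        for n in range(Ns):
            # symbol n of code k occupies chips n*Q .. (n+1)*Q-1
            C[n * Q:(n + 1) * Q, k * Ns + n] = codes[k]
    return C

rng = np.random.default_rng(0)
Q, K, Ns, W = 4, 2, 3, 3                  # toy sizes (hypothetical)
L = Q * Ns
h = rng.standard_normal(W) + 1j * rng.standard_normal(W)
codes = rng.choice([-1.0, 1.0], size=(K, Q))
d = rng.choice([-1.0, 1.0], size=K * Ns)

H = channel_matrix(h, L)
C = code_matrix(codes, Ns)
s = C @ d                                 # composite spread signal
r = H @ s                                 # received vector without noise
```

Multiplying by H is the same as a full linear convolution of the chip sequence with the channel response, which is why r has L+W−1 elements.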

The SUD comprises two stages: (a) channel equalization and (b) de-spreading. In the first stage, the composite spread signal s is estimated from r=Hs+n, preferably through a minimum mean squared error (MMSE) equalizer or a zero forcing solution.

An MMSE equalizer is per Equation 5.

$\hat{\underline{s}} = [R_H + \sigma^2 I]^{-1} \underline{H}^H \underline{r} = [R_H + \sigma^2 I]^{-1} R_H \underline{s} + [R_H + \sigma^2 I]^{-1} \underline{H}^H \underline{n} \qquad \text{Equation 5}$

A zero forcing solution is per Equation 6.

$\hat{\underline{s}} = R_H^{-1} \underline{H}^H \underline{r} \qquad \text{Equation 6}$

I is the identity matrix and R_(H)=H ^(H) H is a square Toeplitz matrix of size L, per Equation 7.

$R_H = \begin{bmatrix} R_0 & R_1 & \cdots & R_{W-1} & 0 & \cdots & 0 \\ R_1^* & R_0 & R_1 & & \ddots & \ddots & \vdots \\ \vdots & R_1^* & \ddots & \ddots & & \ddots & 0 \\ R_{W-1}^* & & \ddots & \ddots & \ddots & & R_{W-1} \\ 0 & \ddots & & \ddots & \ddots & R_1 & \vdots \\ \vdots & \ddots & \ddots & & R_1^* & R_0 & R_1 \\ 0 & \cdots & 0 & R_{W-1}^* & \cdots & R_1^* & R_0 \end{bmatrix} \qquad \text{Equation 7}$

'*' denotes the conjugate operation. In the second stage, a simple de-spreading process is performed to obtain the estimate d̂ of the symbol sequence d, per Equation 8.

$\hat{\underline{d}} = \underline{C}^H \hat{\underline{s}} \qquad \text{Equation 8}$
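As a reference point for the fast implementation discussed later, the two-stage SUD can be sketched directly in NumPy by solving the linear systems of Equations 5 and 6 and applying Equation 8. The function names, the example channel taps, and the chip values are hypothetical; a direct solve like this is only practical for small L and is shown purely to make the equations concrete.

```python
import numpy as np

def channel_matrix(h, L):
    """(L+W-1) x L Toeplitz channel matrix, as in Equation 1."""
    W = len(h)
    H = np.zeros((L + W - 1, L), dtype=complex)
    for col in range(L):
        H[col:col + W, col] = h
    return H

def mmse_equalize(H, r, sigma2):
    """Equation 5: solve (R_H + sigma^2 I) s_hat = H^H r."""
    R_H = H.conj().T @ H
    return np.linalg.solve(R_H + sigma2 * np.eye(len(R_H)), H.conj().T @ r)

def zero_forcing(H, r):
    """Equation 6: solve R_H s_hat = H^H r."""
    R_H = H.conj().T @ H
    return np.linalg.solve(R_H, H.conj().T @ r)

def despread(C, s_hat):
    """Equation 8: d_hat = C^H s_hat (no normalization applied)."""
    return C.conj().T @ s_hat

# Toy example with assumed values, no noise added
h = np.array([1.0, 0.5, 0.25j])
L = 8
H = channel_matrix(h, L)
s = np.array([1, -1, 1, 1, -1, -1, 1, -1], dtype=complex)
r = H @ s
s_zf = zero_forcing(H, r)
s_mmse = mmse_equalize(H, r, sigma2=1e-9)
```

In the noiseless case the zero forcing solution returns the transmitted chips, and the MMSE solution approaches it as the noise variance goes to zero.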

To implement Equation 5 efficiently, it is desirable to approximate the algorithm properly. To do this, first the Toeplitz matrix H is extended from the size of L+W−1 by L to the size of Lm+W−1 by Lm and then the square matrix R_(H) is extended from the size of L to Lm with Lm≧L+W−1, while keeping the banded and Toeplitz structure of the matrices intact. The vector r is extended to the length Lm by zero padding, if the length of r is less than Lm. The vectors s and n are effectively automatically extended, due to the padding extension of the other vector/matrix. The extended versions of the matrices and vector H, R_(H), s, r and n are denoted as H _(E), R_(E), s _(E), r _(E) and n _(E), respectively. R_(E) is as follows, R_(E)=H _(E) ^(H) H _(E).

The last Lm−L elements of s _(E) can all be deemed to be zero, which is essential in understanding the avoidance of the implementation error. With these notations, Equation 5 can be rewritten as Equation 9.

$\hat{\underline{s}}_E = [R_E + \sigma^2 I]^{-1} \underline{H}_E^H \underline{r}_E \qquad \text{Equation 9}$

r _(E) is as follows: r _(E)=H _(E) s _(E)+n _(E). Because Equation 9 is the extended version of Equation 5, there should not be any differences between them if only the first L elements of ŝ _(E) are considered. The last W−1 rows of H _(E) are cut to get a new square matrix of size Lm, denoted by H _(S). Suppose R_(Cir) and H _(Cir) represent the circular counterparts of R_(E) and H _(S), respectively. From R_(E) and H _(S), R_(Cir) and H _(Cir) are constructed as per Equations 10 and 11.

$R_{Cir} = \begin{bmatrix} R_0 & R_1 & \cdots & R_{W-1} & 0 & \cdots & 0 & R_{W-1}^* & \cdots & R_1^* \\ R_1^* & R_0 & \ddots & & \ddots & \ddots & & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & & \ddots & \ddots & & \ddots & R_{W-1}^* \\ R_{W-1}^* & & \ddots & \ddots & \ddots & & \ddots & \ddots & & 0 \\ 0 & \ddots & & \ddots & \ddots & \ddots & & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & & \ddots & \ddots & \ddots & & \ddots & 0 \\ 0 & & \ddots & \ddots & & \ddots & \ddots & \ddots & & R_{W-1} \\ R_{W-1} & \ddots & & \ddots & \ddots & & \ddots & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & & \ddots & \ddots & & \ddots & R_0 & R_1 \\ R_1 & \cdots & R_{W-1} & 0 & \cdots & 0 & R_{W-1}^* & \cdots & R_1^* & R_0 \end{bmatrix} \qquad \text{Equation 10}$

$\underline{H}_{Cir} = \begin{bmatrix} h_0 & & & & h_{W-1} & \cdots & h_1 \\ h_1 & h_0 & & & & \ddots & \vdots \\ \vdots & h_1 & \ddots & & & & h_{W-1} \\ h_{W-1} & \vdots & \ddots & \ddots & & & \\ & h_{W-1} & & \ddots & h_0 & & \\ & & \ddots & & h_1 & h_0 & \\ & & & h_{W-1} & \cdots & h_1 & h_0 \end{bmatrix} \qquad \text{Equation 11}$

R_(Cir) and H _(Cir) are both square matrices of size Lm. Because the last Lm−L elements of s _(E) are zero, r _(E)=H _(S) s _(E)+n _(E). Replacing R_(E), H _(E) ^(H) and r _(E) by R_(Cir), H _(Cir) ^(H) and H _(S) s _(E)+n _(E), respectively, in Equation 9, results in Equation 12.

$\tilde{\underline{s}}_E = [R_{Cir} + \sigma^2 I]^{-1} \underline{H}_{Cir}^H \underline{r}_E = [R_{Cir} + \sigma^2 I]^{-1} \underline{H}_{Cir}^H \underline{H}_S \underline{s}_E + [R_{Cir} + \sigma^2 I]^{-1} \underline{H}_{Cir}^H \underline{n}_E \qquad \text{Equation 12}$

Assuming H _(S)=H _(Cir)−H _(Δ) (H _(Δ) is an error matrix), Equation 12 can be expressed as Equation 13.

$\tilde{\underline{s}}_E = [R_{Cir} + \sigma^2 I]^{-1} R_{Cir} \underline{s}_E + [R_{Cir} + \sigma^2 I]^{-1} \underline{H}_{Cir}^H \underline{n}_E - [R_{Cir} + \sigma^2 I]^{-1} \underline{y} \qquad \text{Equation 13}$

y=H _(Cir) ^(H) H _(Δ) s _(E) is a column vector of length Lm and the error matrix H _(Δ) is per Equation 14.

$\underline{H}_{\Delta} = \begin{bmatrix} & & & h_{W-1} & \cdots & h_1 \\ & & & & \ddots & \vdots \\ & & & & & h_{W-1} \\ & & & & & \\ & & & & & \\ & & & & & \end{bmatrix} \qquad \text{Equation 14}$

The non-zero elements are located in a triangular area between the first W−1 rows and the last W−1 columns. Without the third term, Equation 13 is very similar to Equation 5 in function when the first L elements are considered. The vector y is evaluated as follows. According to the structures of the matrices H _(Cir) ^(H) and H _(Δ), the matrix X=H _(Cir) ^(H) H _(Δ) is a square matrix of size Lm with the structure per Equation 15.

$\underline{X} = \begin{bmatrix} & & & x & \cdots & x \\ & & & & \ddots & \vdots \\ & & & & & x \\ & & & & & \\ & & & x & \cdots & x \\ & & & \vdots & \ddots & \vdots \\ & & & x & \cdots & x \end{bmatrix} \qquad \text{Equation 15}$

The non-zero elements, denoted by 'x', are located only in two areas: (1) a triangular one between the first W−1 rows and the last W−1 columns and (2) a square one between the last W−1 rows and columns. Because the last (Lm−L) elements of s _(E) are zero, all elements of y=H _(Cir) ^(H) H _(Δ) s _(E)=X s _(E) are zero if Lm≧L+W−1. When Lm≧L+W−1, Equation 16 results.

$\tilde{\underline{s}}_E = [R_{Cir} + \sigma^2 I]^{-1} \underline{H}_{Cir}^H \underline{r}_E = [R_{Cir} + \sigma^2 I]^{-1} R_{Cir} \underline{s}_E + [R_{Cir} + \sigma^2 I]^{-1} \underline{H}_{Cir}^H \underline{n}_E \qquad \text{Equation 16}$

Equation 16 is a good approximation of Equation 5 when only the first L estimates are taken into account. The first portion of Equation 16 is referred to as the EA.

In contrast, the head and tail parts of a data field are affected significantly when Toeplitz matrices are replaced with their circulant counterparts in Equation 5 directly, without matrix extension. The implementation algorithm without matrix extension is called the truncated algorithm (TA). With TA, Equations 9 to 15 are still valid, except that Lm=L. The degradation has two causes. First, when replacing H in Equation 5 with H _(Cir) of size L, the length of the received signal vector r is limited to L. This results in the loss of multipath signals of the data in the tail part of a data field, so the estimates of the affected data become very poor. Second, when Lm=L, y is a column vector of length L in which the first and last W−1 elements are non-zero. Because B_(Cir)=R_(Cir)+σ²I has a banded structure along the diagonal, the inverse of B_(Cir) has approximately the same structure. Therefore, the relatively large values of the column vector z=B_(Cir) ⁻¹ y are located in its first and last W−1 elements and affect the estimates ŝ in both the head and tail areas. For the second cause, the number of affected estimates depends on the channel response length W: the larger the delay spread of a channel (W), the more estimates are affected.
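The role of the error vector y can be checked numerically. The following sketch, with hypothetical helper names and small assumed values for h, L and the chip vector, builds the circulant matrix and the error matrix (circulant minus square Toeplitz) and evaluates y for the truncated case Lm=L and for the extended case Lm=L+W−1.

```python
import numpy as np

def circulant(h, n):
    """n x n circulant matrix whose first column is h followed by zeros."""
    u = np.zeros(n, dtype=complex)
    u[:len(h)] = h
    return np.column_stack([np.roll(u, k) for k in range(n)])

def error_matrix(h, n):
    """H_delta = H_Cir - H_S: the wrapped-around entries in the upper-right corner."""
    H_cir = circulant(h, n)
    H_s = np.tril(H_cir)      # removing the wrap-around leaves the square Toeplitz H_S
    return H_cir - H_s

h = np.array([1.0, 0.5, 0.25])            # assumed channel taps, W = 3
W, L = len(h), 6
s = np.arange(1, L + 1, dtype=complex)    # assumed chip values

# Truncated algorithm: Lm = L, y has non-zero head and tail elements
y_ta = circulant(h, L).conj().T @ error_matrix(h, L) @ s

# Extended algorithm: Lm = L + W - 1, s_E is s followed by W-1 zeros
Lm = L + W - 1
s_E = np.concatenate([s, np.zeros(Lm - L)])
y_ea = circulant(h, Lm).conj().T @ error_matrix(h, Lm) @ s_E
```

With these values, `y_ea` vanishes while `y_ta` is non-zero only in its first and last W−1 elements, which matches the structure of X described in the text.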

Additionally, the implementation algorithm given by Equation 16 can be extended to support over-sampling. With over-sampling, the effect of timing error is mitigated. Assuming the sampling rate is M times the chip rate, M received signal vectors, denoted by r _(E) ^((m)) for m=1,2, . . . ,M, are available. However, the time interval between two successive samples in each r _(E) ^((m)) is the chip duration. Similarly, there are also M sets of channel responses, denoted by h ^((m))=(h_(0,m),h_(1,m), . . . ,h_(W−1,m)) for m=1,2, . . . ,M. With these channel responses, a total of 2M circulant matrices H _(Cir,m) and R_(Cir,m) for m=1,2, . . . ,M can be constructed. Accordingly, the implementation algorithm with over-sampling can be written as per Equation 17.

$\tilde{\underline{s}}_E = \left[ \sum_{m=1}^{M} \left( R_{Cir,m} + \frac{1}{M}\sigma_m^2 I \right) \right]^{-1} \sum_{m=1}^{M} \underline{H}_{Cir,m}^H \underline{r}_E^{(m)} \qquad \text{Equation 17}$

σ_(m) ² is the noise variance corresponding to the m-th input vector r _(E) ^((m)).

Before implementation, the value of Lm is determined. Because Lm must be at least L+W−1, Lm is chosen as per Equation 18.

Lm=max {L}+max {W}+ε,  Equation 18

max {.} is the maximum value of {.} and ε is a positive integer chosen to make Lm a good length for FFT implementation. For example, in the UTRA wideband TDD system (WTDD), max {L}=1104 and max {W}=114. ε is chosen to be equal to 14 so that Lm=1232. With this length, the FFT can be performed very efficiently by PFA because 1232 can be factored as 1232=7×11×16. With complex input, the real multiplies and adds required by the 1232-point FFT are 8836 and 44228, respectively. From Equation 18, Lm depends on the specific system design. However, the implementation approach described herein is applicable to any other TDD system, such as the UTRA narrowband TDD system (TD-SCDMA).
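The arithmetic of the WTDD example can be verified with a few lines of Python. The pairwise coprimality check is included on the assumption that the prime factor algorithm, in its usual Good-Thomas form, requires mutually coprime factor lengths; the variable names are illustrative.

```python
from itertools import combinations
from math import gcd, prod

max_L, max_W, eps = 1104, 114, 14      # values quoted in the text for WTDD
Lm = max_L + max_W + eps               # Equation 18

factors = (7, 11, 16)
pairwise_coprime = all(gcd(a, b) == 1 for a, b in combinations(factors, 2))

print(Lm, prod(factors), pairwise_coprime)   # prints: 1232 1232 True
```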

In the following, the preferred implementation procedure of Equation 17 is described as method steps under the assumption that the FFT length P equals the selected Lm.

The first column g of the circulant matrix

$\sum\limits_{m = 1}^{M}\left( {R_{{Cir},m} + {\frac{1}{M}\sigma_{m}^{2}I}} \right)$ is computed based on the estimated channel response and noise power, yielding Equation 19.

$\begin{matrix} {\underset{\_}{g} = {\sum\limits_{m = 1}^{M}{\left( {{R_{0,m} + {\frac{1}{M}\sigma_{m}^{2}}},R_{1,m}^{*},\ldots\mspace{11mu},R_{{W - 1},m}^{*},0,\ldots\mspace{11mu},0,R_{{W - 1},m},{\ldots\mspace{11mu} R_{1,m}}} \right)^{T}.}}} & \text{Equation~~19} \end{matrix}$

The circulant matrix

$\sum\limits_{m = 1}^{M}\left( {R_{{Cir},m} + {\frac{1}{M}\sigma_{m}^{2}I}} \right)$ in the FFT domain is decomposed, yielding Equation 20.

$\sum_{m=1}^{M} \left( R_{Cir,m} + \frac{1}{M}\sigma_m^2 I \right) = D_P^{-1} \Lambda_R D_P \qquad \text{Equation 20}$

D_(P) and D_(P) ⁻¹ are the P-point FFT and inverse FFT (IFFT) matrices defined as per Equation 21.

$D_P \underline{x} = \sum_{n=0}^{P-1} x(n)\, e^{-j\frac{2\pi kn}{P}} \quad \text{and} \quad D_P^{-1} \underline{x} = \frac{1}{P} \sum_{n=0}^{P-1} x(n)\, e^{j\frac{2\pi kn}{P}}, \quad \text{for } k = 0, 1, \ldots, P-1 \qquad \text{Equation 21}$

Λ_(R) is a diagonal matrix of size P, whose diagonal is D_(P) g. Λ_(R) is denoted as Λ_(R)=diag(D_(P) g). The relation between D_(P) ⁻¹ and D_(P) is D_(P) ⁻¹=(1/P)D_(P)*.

The circulant matrix H _(Cir,m) is decomposed in the FFT domain, yielding Equation 22.

$\underline{H}_{Cir,m} = D_P^{-1} \Lambda_{H_m} D_P \qquad \text{Equation 22}$

Λ_(H) _(m) is a diagonal matrix of size P, whose diagonal is D_(P) u _(m) with u _(m)=[h_(0,m),h_(1,m), . . . ,h_(W−1,m),0, . . . ,0]^(T) being the first column of H _(Cir,m).

The received signal vector r ^((m)) is reconstructed by zero padding to get the extended signal vector r _(E) ^((m)) of length P.

The composite spread signal vector {tilde over (s)} _(E) is computed yielding Equation 23 or Equation 24 in the frequency domain.

$\tilde{\underline{s}}_E = \left[ \sum_{m=1}^{M} \left( R_{Cir,m} + \frac{1}{M}\sigma_m^2 I \right) \right]^{-1} \sum_{m=1}^{M} \underline{H}_{Cir,m}^H \underline{r}_E^{(m)} = D_P^{-1} \Lambda_R^{-1} \sum_{m=1}^{M} \Lambda_{H_m}^{*} D_P \underline{r}_E^{(m)} \qquad \text{Equation 23}$

$D_P \tilde{\underline{s}}_E = \left( \sum_{m=1}^{M} \left( D_P \underline{u}_m \right)^{*} \otimes \left( D_P \underline{r}_E^{(m)} \right) \right) / \left( D_P \underline{g} \right) \quad \text{and} \quad \tilde{\underline{s}}_E = D_P^{-1} \left\{ D_P \tilde{\underline{s}}_E \right\} \qquad \text{Equation 24}$

The operators ⊗ and / denote vector multiplication and division performed on an element-by-element basis, respectively. The last P−L elements of s̃ _(E) are discarded to get another vector s̃ of length L.

The composite spread signal estimate s̃ is despread per Equation 8, yielding the estimated data symbols d̂.
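The frequency-domain procedure above can be sketched with NumPy's FFT routines, which follow the same sign and scaling conventions as Equation 21. The function name `ea_estimate`, its argument layout, and the example data are assumptions for illustration. The sketch also uses the fact that g is a circular autocorrelation, so that D_P g equals the sum over m of |D_P u_m|² plus σ_m²/M, instead of forming g explicitly.

```python
import numpy as np

def ea_estimate(r_list, h_list, sigma2_list, L, P):
    """Return the first L elements of the extended estimate of Equation 24.

    r_list, h_list and sigma2_list hold the M received vectors, channel
    responses and noise variances; P is the FFT length (P = Lm).
    """
    M = len(r_list)
    numerator = np.zeros(P, dtype=complex)
    denominator = np.zeros(P, dtype=complex)
    for r, h, sigma2 in zip(r_list, h_list, sigma2_list):
        r_E = np.zeros(P, dtype=complex)
        r_E[:len(r)] = r                      # zero padding of the received vector
        u = np.zeros(P, dtype=complex)
        u[:len(h)] = h                        # first column of H_Cir,m
        U = np.fft.fft(u)
        numerator += np.conj(U) * np.fft.fft(r_E)
        denominator += np.abs(U) ** 2 + sigma2 / M   # contribution to D_P g
    s_E = np.fft.ifft(numerator / denominator)
    return s_E[:L]                            # keep only the first L estimates

# Example with chip-rate sampling (M = 1) and assumed toy values
rng = np.random.default_rng(0)
L, W = 16, 4
P = L + W - 1
h = rng.standard_normal(W) + 1j * rng.standard_normal(W)
s = rng.choice([-1.0, 1.0], size=L)
r = np.convolve(h, s)                         # noiseless received vector
s_tilde = ea_estimate([r], [h], [1e-3], L, P)
```

The cost is dominated by the FFTs of length P, whereas a direct solve of Equation 17 would require inverting a P by P matrix.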

FIG. 1 is a block diagram of a system 100. For an over-sampled system, M sampled sequences are processed, r ^((1)) . . . r ^((M)) and h ^((1)) . . . h ^((M)). For a chip rate sampled sequence, only one sampled sequence is processed, r ^((1)) and h ^((1)). The system 100 receives the signals r ^((1)) . . . r ^((M)) at inputs 105 ₁ . . . 105 _(M) (105) and receives the channel impulse responses h ^((1)) . . . h ^((M)) at inputs 110 ₁ . . . 110 _(M) (110). The received signals r ^((1)) . . . r ^((M)) are zero padded in the tail by zero padding devices 115 ₁ . . . 115 _(M) (115) until the length of each sequence reaches Lm. The extended sequences after zero padding are denoted as r _(E) ^((1)) . . . r _(E) ^((M)), which exit the zero padding devices 115 via outputs 120 ₁ . . . 120 _(M) (120). The channel impulse responses h ^((1)) . . . h ^((M)) are zero padded in the tail by zero padding devices 125 ₁ . . . 125 _(M) (125) until the length of each extended sequence reaches Lm. The extended sequences after zero padding are denoted as u ₁ . . . u _(M), which exit the zero padding devices 125 via outputs 130 ₁ . . . 130 _(M) (130). DFT or FFT blocks 135 ₁ . . . 135 _(M) (135) receive the outputs 120 from zero padding devices 115 and perform a DFT or FFT on r _(E) ^((1)) . . . r _(E) ^((M)), producing F(r _(E) ^((1))) . . . F(r _(E) ^((M))). DFT or FFT blocks 140 ₁ . . . 140 _(M) (140) receive the outputs 130 from zero padding devices 125 and perform a DFT or FFT on u ₁ . . . u _(M), producing F(u ₁) . . . F(u _(M)). Conjugate devices 145 ₁ . . . 145 _(M) (145) conjugate F(u ₁) . . . F(u _(M)), producing F(u ₁)* . . . F(u _(M))*. Element-to-element multipliers 150 ₁ . . . 150 _(M) (150) multiply the sequences F(r _(E) ^((1))) . . . F(r _(E) ^((M))) and F(u ₁)* . . . F(u _(M))*, producing F(r _(E) ^((1)))·F(u ₁)* . . . F(r _(E) ^((M)))·F(u _(M))*.

All of the M sampled sequence results are added element-to-element by adder 175 with

$\sum_{m=1}^{M} F\left( \underline{r}_E^{(m)} \right) \cdot F\left( \underline{u}_m \right)^{*}.$ A channel correlation vector g is generated by a channel correlation generator 180 using extended channel response sequences u ₁, . . . , u _(M), with

$\underset{\_}{g} = {\sum\limits_{m = 1}^{M}{{\underset{\_}{g}}^{(m)}.}}$

Using a MMSE algorithm, a noise variance σ_(m) ² is added to the first element of vector g ^((m)). Vector g ^((m)) is generated using u _(m). The i-th element of the vectors g ^((m)) for the m-th sampled sequence is computed by: circulating the conjugate vector u _(m)* by downshifting i−1 elements and multiplying the shifted vector u _(m)* by the vector u _(m) such that g ^((m))(i)=u _(m,(i−1)shifts) ^(H) u _(m), m=1,2, . . . , M. A DFT or FFT device 185 performs a DFT or FFT on channel correlation vector g, F(g). Divider 190 divides element-by-element the output of adder 175 by the output of DFT or FFT device 185, such that

$\frac{\sum_{m=1}^{M} F\left( \underline{r}_E^{(m)} \right) \cdot F\left( \underline{u}_m \right)^{*}}{F\left( \underline{g} \right)}.$ Inverse DFT or inverse FFT device 194 performs an inverse DFT or inverse FFT on the output of divider 190, such that

$F^{-1}\left( \frac{\sum_{m=1}^{M} F\left( \underline{r}_E^{(m)} \right) \cdot F\left( \underline{u}_m \right)^{*}}{F\left( \underline{g} \right)} \right).$ The output of the inverse DFT or inverse FFT device 194 is an estimate of the composite spread signal ŝ. Despreader 198 despreads the output of inverse DFT or inverse FFT device 194 to obtain the estimated data symbols d̂.
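The computation performed by the channel correlation generator 180 can be sketched as follows. The function name `channel_correlation` and its `noise_term` argument are hypothetical, and the channel taps are assumed values; the caller decides which noise term to add to the first element. Each element of g is obtained by circularly downshifting the extended channel response and taking the conjugated inner product with the unshifted response.

```python
import numpy as np

def channel_correlation(u, noise_term=0.0):
    """Channel correlation vector for one sampled sequence.

    Element i (0-based) is the Hermitian inner product of u circularly
    shifted down by i elements with the unshifted u.
    """
    P = len(u)
    # np.vdot conjugates its first argument
    g = np.array([np.vdot(np.roll(u, i), u) for i in range(P)], dtype=complex)
    g[0] += noise_term
    return g

h = np.array([1.0, 0.5 - 0.5j, 0.25j])   # assumed channel taps
P = 8
u = np.zeros(P, dtype=complex)
u[:len(h)] = h                            # zero-padded channel response
g = channel_correlation(u)
```

Without the noise term, g is the first column of H _(Cir) ^(H) H _(Cir), and its FFT equals the squared magnitude of the FFT of u, so device 185 could in principle reuse the outputs of blocks 140.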

Referring to FIGS. 2A, 2B and 2C, the procedure for performing an EA in accordance with a preferred embodiment of the present invention is described as follows:

In step 205, system 100 receives signal r ^((1)) at input 105 and receives the channel impulse response h ^((1)) at input 110.

In step 210, the received signal r ^((1)) is zero padded in the tail by zero padding device 115 until the length of the sequence reaches Lm. The extended sequence after zero padding is denoted as r _(E) ^((1)), which exits the zero padding device 115 via output 120.

In step 215, the channel impulse response h ^((1)) is zero padded in the tail by zero padding device 125 until the length of the extended sequence reaches Lm. The extended sequence after zero padding is denoted as u ₁, which exits the zero padding device 125 via output 130.

In step 220, DFT or FFT block 135 receives the output 120 from zero padding device 115 and performs a DFT or FFT on r _(E) ^((1)), producing F(r _(E) ^((1))). DFT or FFT block 140 receives the output 130 from zero padding device 125 and performs a DFT or FFT on u ₁, producing F(u ₁).

In step 225, conjugate device 145 conjugates F(u ₁), F(u ₁)*.

In step 230, element-to-element multiplier 150 multiplies the sequences F(r _(E) ⁽¹⁾) and F(u ₁)* producing F(r _(E) ⁽¹⁾)·F(u ₁)*.

In step 235, for over-sampling system with M sampled sequences, steps 210 to 230 are repeated for sampled sequences 2, . . . ,M, F(r _(E) ^((m)))·F(u _(m))*, for m=2, . . . , M.

In step 240, all of the M sampled sequence results obtained in steps 230 and 235 are added element-to-element by adder 175,

${\sum\limits_{m = 1}^{M}{{F\left( {\underset{\_}{r}}_{E}^{(m)} \right)} \cdot {F\left( {\underset{\_}{u}}_{m} \right)}^{*}}},$ for m=1,2, . . . ,M.

In step 245, a channel correlation vector g is generated by a channel correlation generator 180 using extended channel response sequences u ₁, . . . , u _(M), such that

$\underset{\_}{g} = {\sum\limits_{m = 1}^{M}{{\underset{\_}{g}}^{(m)}.}}$

In step 250, a DFT or FFT 185 performs DFT or FFT on channel correlation vector g, F(g).

In step 255, divider 190 divides element-by-element the result in step 240 by the result in step 250,

$\frac{\sum\limits_{m = 1}^{M}{{F\left( {\underset{\_}{r}}_{E}^{(m)} \right)} \cdot {F\left( {\underset{\_}{u}}_{m} \right)}^{*}}}{F\left( \underset{\_}{g} \right)}.$

In step 260, an inverse DFT or inverse FFT 194 is performed on the result of step 255,

$F^{-1}\left( \frac{\sum_{m=1}^{M} F\left( \underline{r}_E^{(m)} \right) \cdot F\left( \underline{u}_m \right)^{*}}{F\left( \underline{g} \right)} \right),$ producing the estimated composite spread signal, ŝ.

In step 265, despreader 198 despreads the output of step 260 to obtain the estimated data symbols {circumflex over (d)}.

In simulation, the described model assumes K=12 codes transmitted with equal code power, and that the effects of the midamble signal on the second data field have been cancelled completely. Each code has SF=16. A total of 1104 chips in a data field is assumed (burst type 2 in WTDD). Because there are two data fields in a timeslot in WTDD, the last 8 bits (4 complex symbols) of the first data field and the first 8 bits of the second data field are defined as TFCI-1 and TFCI-2, respectively. Two algorithms, the truncated algorithm (TA) and the extended algorithm (EA), are employed. The raw bit error rates of TFCI-1, TFCI-2 and all bits are evaluated for both EA and TA with chip rate sampling. 1000 timeslots are accumulated for each SNR point. The simulations are run over the WG4 Case-1 and Case-2 channels.

FIGS. 3 and 4 present the performance of TFCI-1 and TFCI-2 in the WG4 Case-1 channel when EA and TA are used. A significant performance gap between EA and TA is found, as shown in FIG. 3. Because TFCI-1 is located in the tail part of the first data field, the performance degradation of TFCI-1 with TA results from two causes: (1) loss of TFCI-1 multipath signals and (2) Toeplitz to circulant matrix replacement in the implementation because the channel response length W is small (W=4) for the WG4 Case-1 channel. This conclusion is confirmed by the results shown in FIG. 4. Because the performance in FIG. 4 is for TFCI-2, which is located in the head part of the second data field, the most probable cause affecting its performance with TA must be the second one: matrix replacement. From FIG. 4, it can be seen that the performance of TFCI-2 with EA and TA is nearly identical. This implies that the estimation error introduced to the TA through the matrix replacement is very limited because of the small value of W for the WG4 Case-1 channel. For example, when W=4, only a quarter of the first symbol is affected because SF=16.

FIG. 5 shows the raw BER of all bits when EA and TA are assumed. Compared with EA, the major contributor to the loss of raw BER of all bits with TA is the TFCI-1 in each timeslot.

FIGS. 6 and 7 show the performance of TFCI-1 and TFCI-2 in the WG4 Case-2 channel when EA and TA are adopted. The Case-2 channel differs from the Case-1 channel by a much larger delay spread (W=46) as well as stronger multipath signal power. FIG. 6 shows that TFCI-1 with TA is almost destroyed, due to both the loss of its multipath signals and matrix replacement. In FIG. 7, the performance of TFCI-2 with TA is still much poorer than that with EA, which is due to matrix replacement only. In this case, W=46 and hence the first three symbols (six bits) are affected significantly. FIG. 8 presents the raw BER of all bits for EA and TA in the Case-2 channel.
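The symbol counts quoted for the two channels are consistent with a rough rule of thumb. It assumes that matrix replacement disturbs on the order of W chips at the head of a data field and that each complex symbol carries two bits (as implied by "8 bits (4 complex symbols)"):

$$\frac{W}{SF} = \frac{4}{16} = 0.25 \ \text{symbol (Case-1)}, \qquad \frac{W}{SF} = \frac{46}{16} = 2.875 \approx 3 \ \text{symbols} = 6 \ \text{bits (Case-2)}.$$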

When a truncated algorithm (TA) is used in the implementation, the data in the head and tail parts of a data field are affected significantly by two factors: the loss of multipath signal information caused by cutting the channel response matrix down to a square matrix, and the error caused by replacing Toeplitz matrices with circulant ones. To overcome this problem, an EA is used. The EA avoids implementation errors by choosing the size of the extended matrices properly. To implement the EA with a DFT, a dynamic-length EA is desirable, while for an EA implemented with a PFA FFT, a fixed-length approach is more appropriate. In the fixed-length approach, the computational complexity can be minimized by choosing a proper PFA length within a certain range. The fixed-length EA allows different data block lengths (burst types) to be supported by a single algorithm. The fixed-length EA simplifies the implementation further, since only one piece of hardware is needed for the single algorithm. Simulation results show that the performance of the EA is much better than that of the TA, especially for the data in the head and tail parts of data fields.
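The matrix replacement error can be illustrated with a small noise-free sketch. It is not the simulated receiver: the channel `h`, the chip sequence `s`, and the lengths are hypothetical toy values, a naive DFT stands in for the FFT, and the equalizer uses a zero-forcing form with no noise term. Because the toy is noise-free, it isolates only the Toeplitz-to-circulant replacement error at the head of the block and does not reproduce the loss of multipath energy at the tail.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT; adequate for a short illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * i * k / n) for k in range(n))
           for i in range(n)]
    return [v / n for v in out] if inverse else out

def linear_conv(s, h):
    """Full linear convolution (the Toeplitz channel model), length N + W - 1."""
    r = [0.0] * (len(s) + len(h) - 1)
    for i, sv in enumerate(s):
        for j, hv in enumerate(h):
            r[i + j] += sv * hv
    return r

def equalize(r, h, length):
    """Zero-pad r and h to `length`, then invert the circulant model in the DFT domain."""
    R = dft(list(r) + [0.0] * (length - len(r)))
    U = dft(list(h) + [0.0] * (length - len(h)))
    # F(r) . F(u)* / |F(u)|^2, i.e. zero-forcing (assumption: no noise term).
    ratio = [a * b.conjugate() / abs(b) ** 2 for a, b in zip(R, U)]
    return [v.real for v in dft(ratio, inverse=True)]

h = [1.0, 0.5, 0.25, 0.125]                               # hypothetical channel, W = 4
s = [1.0 if (5 * i) % 7 < 4 else -1.0 for i in range(32)]  # hypothetical chips, N = 32
N, W = len(s), len(h)
r = linear_conv(s, h)                                     # N + W - 1 received samples

# TA: keep only the first N samples (square matrix) and treat it as circulant.
s_ta = equalize(r[:N], h, N)
# EA: keep all N + W - 1 samples and extend both sequences to Lm >= N + W - 1.
Lm = N + W - 1
s_ea = equalize(r, h, Lm)[:N]

err_ta = [abs(a - b) for a, b in zip(s_ta, s)]
err_ea = [abs(a - b) for a, b in zip(s_ea, s)]
```

In this toy case `err_ea` stays at rounding level over the whole block, while `err_ta` is clearly nonzero in the first few chips. The reason is that a linear convolution coincides with a circular one only when the sequences are extended to at least N + W - 1 samples.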

The invention can be implemented at a base station or wireless transmit/receive unit (WTRU). Hereafter, a wireless transmit/receive unit (WTRU) includes but is not limited to a user equipment, mobile station, fixed or mobile subscriber unit, pager, or any other type of device capable of operating in a wireless environment. When referred to hereafter, a base station includes but is not limited to a base station, Node-B, site controller, access point or other interfacing device in a wireless environment.

FIGS. 9A and 9B show receiver implementations using extended algorithm data detection. Referring to FIG. 9A, radio frequency (RF) signals are received by an antenna 300. A sampling device 305 produces a chip rate received vector r. A channel estimation device 325 determines a channel impulse response h for the received vector. A single user detection device 310 uses the received vector r and the channel impulse response h to estimate the data vector d using the extended algorithm. The received vector r is processed by a channel equalizer 315 using the channel impulse response h to determine a spread vector s. A despreader 320 using the transmission codes C despreads the spread vector s to estimate the data vector d.

Referring to FIG. 9B, RF signals are received by an antenna 300. A sampling device 305 samples the received signal at a multiple M of the chip rate, producing M received vector sequences r ₁ . . . r _(m). A channel estimation device 325 determines a channel impulse response h ₁ . . . h _(m) corresponding to each received vector r ₁ . . . r _(m). A single user detection device 310 uses the received vector sequences r ₁ . . . r _(m) and the channel impulse responses h ₁ . . . h _(m) to estimate the data vector d using the extended algorithm. The received vectors r ₁ . . . r _(m) are processed by a channel equalizer 315 using the channel impulse responses h ₁ . . . h _(m) to determine a spread vector S. A despreader 320 using the transmission codes C despreads the spread vector S to estimate the data vector d.
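The equalizer and despreader chain of FIGS. 9A and 9B can be sketched along the lines of steps (b) through (l) recited in the claims. This is a minimal illustration under stated assumptions, not the claimed implementation: each g^(m) is taken to be the circular autocorrelation of the extended channel sequence, with an optional noise power added to the first element of g; a naive DFT stands in for the FFT; and the code, symbols, channels, and function names are hypothetical. Setting M=1 gives the chip rate case of FIG. 9A.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT standing in for the FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * i * k / n) for k in range(n))
           for i in range(n)]
    return [v / n for v in out] if inverse else out

def pad(x, length):
    """Zero-pad a sequence in the tail up to `length`."""
    return list(x) + [0.0] * (length - len(x))

def linear_conv(s, h):
    """Noise-free channel model used only to build the toy input."""
    r = [0j] * (len(s) + len(h) - 1)
    for i, sv in enumerate(s):
        for j, hv in enumerate(h):
            r[i + j] += sv * hv
    return r

def channel_correlation(u_list, noise_power):
    """Assumption: g = sum over m of the circular autocorrelation of u_m, plus noise on g[0]."""
    L = len(u_list[0])
    g = [0j] * L
    for u in u_list:
        for n in range(L):
            g[n] += sum(complex(u[k]).conjugate() * u[(k + n) % L] for k in range(L))
    g[0] += noise_power
    return g

def ea_equalize(r_list, h_list, Lm, noise_power=0.0):
    u_list = [pad(h, Lm) for h in h_list]                   # step (c)
    numerator = [0j] * Lm
    for r, u in zip(r_list, u_list):
        R = dft(pad(r, Lm))                                 # steps (b), (d)
        U = dft(u)                                          # step (e)
        for i in range(Lm):
            numerator[i] += R[i] * U[i].conjugate()         # steps (f), (g), summed over m
    G = dft(channel_correlation(u_list, noise_power))       # steps (h), (i)
    return dft([a / b for a, b in zip(numerator, G)], inverse=True)  # steps (j), (k)

def despread(chips, code):
    """Step (l) for one code with unit-magnitude chips."""
    sf = len(code)
    return [sum(chips[i * sf + j] * complex(code[j]).conjugate() for j in range(sf)) / sf
            for i in range(len(chips) // sf)]

code = [1, -1, 1, -1]                             # hypothetical code, SF = 4
d = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]            # hypothetical QPSK symbols
s = [sym * c for sym in d for c in code]          # spread chips, N = 16
h_list = [[1.0, 0.5, 0.25], [0.8, 0.3, -0.2]]     # hypothetical channels, M = 2, W = 3
r_list = [linear_conv(s, h) for h in h_list]      # noise-free received sequences
Lm = 20                                           # any Lm >= N + W - 1 = 18
s_hat = ea_equalize(r_list, h_list, Lm)
d_hat = despread(s_hat[:len(s)], code)
```

With noise-free input and `noise_power=0.0`, `d_hat` matches `d` up to rounding. A nonzero `noise_power` turns the division into a regularized (MMSE-style) one, which is the role the noise term plays in the sketch.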

While the present invention has been described in terms of the preferred embodiment, other variations which are within the scope of the invention as outlined in the claims below will be apparent to those skilled in the art. 

1. A method of performing an extended algorithm (EA) with over-sampling, the method comprising: (a) receiving a signal r ⁽¹⁾ at a first input and a channel impulse response h ⁽¹⁾ at a second input; (b) zero padding the received signal r ⁽¹⁾ in the tail until the length of sequence achieves length Lm and denoting the extended sequence after zero padding as r _(E) ⁽¹⁾; (c) zero padding the channel impulse response h ⁽¹⁾ in the tail until the length of the extended sequence achieves length Lm and denoting the extended sequence after zero padding as u ₁; (d) performing a discrete Fourier transform (DFT) or fast Fourier transform (FFT) on r _(E) ⁽¹⁾ such that F(r _(E) ⁽¹⁾); (e) performing DFT or FFT on u ₁ such that F(u ₁); (f) conjugating F(u ₁) such that F(u ₁)*; (g) multiplying the sequences F(r _(E) ⁽¹⁾) and F(u ₁)* such that F(r _(E) ⁽¹⁾)·F(u ₁)*, wherein for M sampled sequences, steps (b)-(g) are repeated for sampled sequences 2, . . . , M such that F(r _(E) ^((m)))·F(u _(m))*, m=2, . . . , M.
2. The method of claim 1 wherein all of the M sampled sequence results obtained in steps (b)-(g) are added element-to-element such that $\sum_{m=1}^{M} F\left(\underline{r}_{E}^{(m)}\right) \cdot F\left(\underline{u}_{m}\right)^{*}$, $m = 1, 2, \ldots, M$.
3. The method of claim 2 further comprising: (h) generating a channel correlation vector g using extended channel response sequences u ₁, . . . , u _(M) such that $\underline{g} = \sum_{m=1}^{M} \underline{g}^{(m)}$; (i) performing DFT or FFT on channel correlation vector g such that F(g); (j) dividing element-by-element the result in step (g) by the result in step (i) such that $\frac{\sum_{m=1}^{M} F\left(\underline{r}_{E}^{(m)}\right) \cdot F\left(\underline{u}_{m}\right)^{*}}{F\left(\underline{g}\right)}$; (k) performing an inverse DFT or inverse FFT on the result of step (j) such that $F^{-1}\left(\frac{\sum_{m=1}^{M} F\left(\underline{r}_{E}^{(m)}\right) \cdot F\left(\underline{u}_{m}\right)^{*}}{F\left(\underline{g}\right)}\right)$; and (l) despreading the result of step (k) to obtain the estimated data symbols $\hat{d}$.
4. A wireless transmit/receive unit (WTRU) configured to perform the method of claim 1.
5. A base station configured to perform the method of claim 1.
6. A receiver configured to perform the method of claim 1.
7. A method of recovering data comprising: computing a first column of a circulant matrix based on estimated channel response and noise power; decomposing a received vector circulant matrix in a fast Fourier transform (FFT) domain; decomposing a channel response circulant matrix in the FFT domain; reconstructing a received signal vector resulting in an extended signal vector; computing a composite spread signal vector; and despreading the composite spread signal vector.
8. A wireless transmit/receive unit (WTRU) configured to perform the method of claim 7.
9. A base station configured to perform the method of claim 7.
10. A receiver configured to perform the method of claim 7.